Abstract

In this paper, a statistical analysis of the log-return fluctuations of the
IPC, the Mexican Stock Market Index, is presented. A sample of daily data
covering the period from 04/09/2000 to 04/09/2010 was analyzed and fitted
to different distributions. Goodness-of-fit tests were performed in order to
quantitatively assess the quality of the estimation. Special attention was paid
to the impact of the sample size on the estimated decay of the distribution
tails. In this study a forceful rejection of normality was obtained. On
the other hand, the null hypothesis that the log-fluctuations follow an
α-stable Levy distribution cannot be rejected at the 5% significance level.
1. Introduction

It is now widely acknowledged that the proper management of assets
and prices (and of the related investment risks) requires proper modeling
of the return distribution of financial assets. For instance, the answer
to whether it is possible to beat the market except by chance depends on
whether stock market prices display long memory and on how probable very
large price fluctuations are. The crucial difficulty is, however, that the
financial market is a very complex system; it has a large number of
non-linearly interacting internal elements, and is highly sensitive to the
action of external forces. Moreover, the real challenge is that the number of
the system's constituents, the details of their interactions, and the external
factors acting upon it are barely known.

Physicists have a long tradition of dealing with similar systems. The 
statistical description of systems of many particles was developed in parallel 
with the statistical analysis of market dynamics [?, ?, ?]. For instance, taking
into account the wide applicability of the Central Limit Theorem, Bachelier
assumed that the return over a given time scale is the consequence of many
independent "microscopic" events, which then lead to a normal distribution
of returns. Thus, he modeled their dynamics as an uncorrelated random
walk with independent, identically Gaussian distributed random variables,
i.e., as a Brownian motion [?]. Since then, the Gaussian assumption for the
distribution of returns has been frequently used in mathematical finance and
is one of the key assumptions behind the classic Black-Scholes option pricing
formula [?], which is based on a Wiener process in the continuous-time setting
or on appropriate discrete-time versions such as binomial trees. 

Although the simplifications that the normal distribution provides in
analytical calculations are very valuable, empirical studies [?, ?, ?, ?] show
that the distribution of returns has tails heavier than those of a Gaussian.
To illustrate this fact, we show in Fig. 1 the histogram of the daily
logarithmic differences of the Mexican IPC index from April 9th, 2000 to
April 9th, 2010. Clearly, large events are very frequent in the data, a fact
largely underestimated by a Gaussian process and of utmost importance for
financial management. It is remarkable that such a feature is present in
quite different markets [?, 10, ?].

One of the skills of physicists is the search for universal laws, i.e., common 
features in most particular realizations of a general class of phenomena. This 
can be done even if the "microscopic" details distinguishing each case are not 
fully understood. Heavy tailed distributions are commonly described by a 
power law (at least in a range of scales), which in turn implies scale invariance, 
a distinct signature of fractals. Fractals have been shown to be a common 
geometrical pattern in many natural systems. Finally, many of these systems 
may be in a state of self-organized criticality, a paradigm that would explain
how organization arises in complex systems and makes them more predictable.
Figure 1: Histogram for daily logarithm differences of the Mexican IPC index from April
9th, 2000 to April 9th, 2010.



This could also be the case for other dynamical systems outside the realm
of natural sciences. In his pioneering analysis of cotton prices, Mandelbrot
[11] (the founder of fractal geometry) observed that, in addition to being
non-Gaussian, the process of returns shows another interesting property: time
scaling, that is, the distributions of returns for various choices of Δt,
ranging from one day up to one month, have similar functional forms. As
already mentioned, observed stock market prices are assumed to be the sum
of many small terms; hence a statistical model describing them must be
such that the sum of two independent random variables having the given
distribution (with a parameter α describing the decay of the tail) yields
again the same kind of distribution (with the same value of α). Motivated
by these empirical findings and this reasoning, Mandelbrot proposed that
returns can be modeled as a kind of stable process introduced by Levy in
1925 [3]. Levy stable distributions are attractive because they are supported
by the generalized Central Limit Theorem. The theorem states that stable
laws are the only possible limit distributions for properly normalized and
centered sums of independent, identically distributed random variables.

An issue here is whether the underlying distributions are actually stable.
Stability only holds for α ∈ (0, 2], and some authors have found that the
tails of some financial time series have to be modeled with α > 2 [?, ?, ?].
Conclusive results on the distribution of returns are difficult to obtain, and
require a large amount of data to study the rare events that give rise to the
fat tails [12]. Another issue is that, according to Cont [8], in order for a
parametric distributional model to reproduce the properties of the empirical
distribution it must have at least four parameters: a location parameter,
a scale parameter, a parameter describing the decay of the tails and an
asymmetry parameter. Several other heavy-tailed distributions (such as
Student's t, hyperbolic or normal inverse Gaussian) also fulfill this
condition. Therefore, in order to grasp the universal laws
behind market dynamics, it is important to keep accumulating empirical
facts about the statistics of different financial indices around the world.

The aim of this paper is to provide a rigorous statistical analysis of the
daily log-return fluctuations of the IPC, the leading Mexican Stock Market
Index. After considering the Gaussian, normal inverse Gaussian and Levy
distributions, we found that the latter provides the best fit, and we show that
the corresponding α is indeed in the range for a stable Levy distribution.

The paper is organized as follows. In the next section we review the
basic properties of Levy-stable distributions. In section 3 we
introduce and analyze the IPC data. In section 4 we discuss the similarities
and disagreements between our results and those of previous works. Finally,
we present our conclusions in section 5.

2. Stable distributions 

Levy-stable distributions were introduced by Paul Levy [3] during his
investigations of the behavior of sums of independent random variables. A
Levy skew alpha-stable distribution is specified by a scale γ, an exponent α,
a skewness parameter β and a location parameter μ. Since the analytical form
of the Levy stable density is known only for a few cases, these distributions
are generally specified by their characteristic function. The most popular
parameterization is that of Samorodnitsky and Taqqu [13], with the
characteristic function:

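$$
\varphi(t) =
\begin{cases}
\exp\left(-\gamma |t| \left[1 + i\beta \frac{2}{\pi}\,\mathrm{sign}(t)\ln|t|\right] + i\mu t\right), & \text{if } \alpha = 1, \\
\exp\left(-\gamma^{\alpha} |t|^{\alpha} \left[1 - i\beta \tan\left(\frac{\pi\alpha}{2}\right)\mathrm{sign}(t)\right] + i\mu t\right), & \text{otherwise,}
\end{cases}
\tag{1}
$$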

where sign(t) stands for the sign of t. Then, the probability density function
is calculated from the characteristic function via the inverse Fourier
transform:

$$
f(x;\alpha,\beta,\gamma,\mu) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} \varphi(t)\, e^{-itx}\, dt. \tag{2}
$$
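As an illustration, the characteristic function in the Samorodnitsky-Taqqu form can be evaluated directly. The following is a minimal sketch using only the Python standard library; the function name `stable_cf` and its argument order are assumptions made here, not part of the stable library used for the computations in this paper.

```python
import cmath
import math


def stable_cf(t, alpha, beta, gamma, mu):
    """Characteristic function of a stable law (Samorodnitsky-Taqqu form).

    Assumes alpha in (0, 2], beta in [-1, 1], gamma > 0 and real mu.
    """
    if t == 0.0:
        return complex(1.0, 0.0)
    sign_t = math.copysign(1.0, t)
    if alpha == 1.0:
        bracket = 1.0 + 1j * beta * (2.0 / math.pi) * sign_t * math.log(abs(t))
        exponent = -gamma * abs(t) * bracket + 1j * mu * t
    else:
        bracket = 1.0 - 1j * beta * math.tan(math.pi * alpha / 2.0) * sign_t
        exponent = -(gamma * abs(t)) ** alpha * bracket + 1j * mu * t
    return cmath.exp(exponent)


# For alpha = 2 and beta = 0 the law is Gaussian with variance 2 * gamma**2,
# whose characteristic function is exp(-gamma**2 * t**2).
phi = stable_cf(0.7, alpha=2.0, beta=0.0, gamma=1.0, mu=0.0)
gauss = math.exp(-0.7 ** 2)

# Modulus of the characteristic function for the fitted IPC parameters
# reported in table 2, evaluated at an arbitrary illustrative point t = 50.
phi_fit = stable_cf(50.0, alpha=1.64, beta=0.219, gamma=0.00815, mu=-0.000186)
```

The density in (2) would then follow from a numerical inverse Fourier transform of this function, which is not attempted in this sketch.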

Levy distributions are characterized by the property of being stable under
convolution, i.e., the sum of two independent and identically Levy-distributed
random variables is also Levy distributed with the same stability index α.
The stability parameter α lies in the interval (0, 2]. A small α corresponds to
a sharp peak but heavy tails, which asymptotically decay as power laws with
exponent -(α + 1). For the normal distribution α = 2. For symmetric
distributions (like the normal distribution), the skewness parameter β = 0.
The skewness parameter must lie in the range [-1, 1]. When β = ±1,
one tail vanishes completely. The parameter γ lies in the interval (0, ∞),
while the location parameter μ is in (-∞, +∞).
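The stability property can be checked numerically. The sketch below draws stable variates with the Chambers-Mallows-Stuck method (a standard simulation technique, swapped in here for illustration; the paper's own computations used Nolan's stable library) and compares a single sample with the rescaled sum of two independent samples. The function names, the seed and the tail level 3.0 are arbitrary choices, and the α = 1 case is omitted.

```python
import math
import random


def stable_rvs(alpha, beta, gamma, mu, n, rng):
    """Draw n stable variates (alpha != 1), Chambers-Mallows-Stuck method."""
    if alpha == 1.0:
        raise ValueError("this sketch omits the alpha = 1 case")
    tan_term = beta * math.tan(math.pi * alpha / 2.0)
    shift = math.atan(tan_term) / alpha
    scale = (1.0 + tan_term ** 2) ** (1.0 / (2.0 * alpha))
    out = []
    for _ in range(n):
        v = math.pi * (rng.random() - 0.5)   # uniform on (-pi/2, pi/2)
        w = rng.expovariate(1.0)             # exponential with mean 1
        x = (scale * math.sin(alpha * (v + shift)) / math.cos(v) ** (1.0 / alpha)
             * (math.cos(v - alpha * (v + shift)) / w) ** ((1.0 - alpha) / alpha))
        out.append(gamma * x + mu)
    return out


def tail_fraction(xs, level=3.0):
    """Fraction of the sample with |x| above the given level."""
    return sum(abs(x) > level for x in xs) / len(xs)


rng = random.Random(12345)
alpha = 1.64
x1 = stable_rvs(alpha, 0.0, 1.0, 0.0, 20000, rng)
x2 = stable_rvs(alpha, 0.0, 1.0, 0.0, 20000, rng)

# For symmetric, centered variates the sum of two independent copies,
# divided by 2**(1/alpha), should follow the same law as a single copy.
rescaled_sum = [(a + b) / 2.0 ** (1.0 / alpha) for a, b in zip(x1, x2)]
frac_single = tail_fraction(x1)
frac_sum = tail_fraction(rescaled_sum)
```

The two tail fractions should agree up to sampling noise, which is the stability under convolution described above.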

The asymptotic behaviour of the Levy distributions is described by the
expression

$$
f(x;\alpha) \sim |x|^{-1-\alpha}. \tag{3}
$$

Hence, the variance of the Levy stable distributions is infinite for all α < 2.
The dependence on α is illustrated in the semilog plots in figure 2.
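The divergence of the variance follows directly from the tail behaviour in (3). As a sketch, assume that the density equals C|x|^{-1-α} exactly beyond some cutoff x_0 > 0 (C and x_0 are illustrative symbols, not quantities defined in the text). The contribution of the right tail to the second moment is then

$$
\int_{x_0}^{X} x^{2}\, C\, x^{-1-\alpha}\, dx
= C\,\frac{X^{2-\alpha} - x_0^{2-\alpha}}{2-\alpha}
\;\longrightarrow\; \infty \quad (X \to \infty) \quad \text{for } \alpha < 2 .
$$

The same computation with x in place of x^2 gives C (x_0^{1-α} - X^{1-α})/(α - 1), which remains finite as X → ∞ only when α > 1, so under this assumption the mean exists only in that range.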

In our computations the stable library developed by Nolan was used. 



3. Statistical analysis of data 

Our IPC data set covers the period from April 2000 to April 2010 and
comprises 2502 returns. The index is labeled Y(t), and the data are written
as the successive differences of the natural logarithm of the index,

$$
S(t) = \ln Y(t + \Delta t) - \ln Y(t). \tag{4}
$$

The daily closing values of the IPC were used, so Δt = 1 day. Fig. 3 displays
the index value as a function of trading day, and the corresponding histogram
of S(t) is presented in figure 1. In table 1 the mean, standard deviation,
skewness and kurtosis are reported for the IPC log-return time series. The
high value of the kurtosis (7.23) shows that the density function of the time
series is more peaked than a Gaussian distribution.
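The construction of S(t) in (4) and the summary statistics of table 1 can be sketched as follows. The closing values below are hypothetical and serve only to make the example runnable; the kurtosis is computed in its non-excess form (equal to 3 for a Gaussian), which is an assumption, since the text does not state which convention was used.

```python
import math


def log_returns(prices):
    """S(t) = ln Y(t + dt) - ln Y(t) for consecutive closing values."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]


def sample_moments(xs):
    """Mean, standard deviation, skewness and (non-excess) kurtosis."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return mean, math.sqrt(m2), m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical closing values, for illustration only (not IPC data).
closes = [100.0, 101.0, 99.5, 100.2, 103.0, 102.1]
returns = log_returns(closes)
mean, std, skew, kurt = sample_moments(returns)
```

A series of n + 1 closing values yields n returns, and the returns sum to the logarithm of the ratio of the last to the first closing value.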






| Index | Mean | Standard deviation | Skewness | Kurtosis |
|---|---|---|---|---|
| IPC | -0.609 × 10^-3 | 0.151 × 10^-1 | -0.395 × 10^-1 | 7.23 |

Table 1: Mean, standard deviation, skewness and kurtosis of the IPC log-return time series.



3.1. Gaussian fit 

The method of maximum likelihood was used to determine the parameters
of the Gaussian distribution that fits the experimental data. The results are
shown in table 2, where μ is the mean and σ the standard deviation. In
figure 4 the empirical PDF for the daily data is shown along with the fitted
Gaussian pdf. As can be seen, it is not a good fit, especially in the tails of
the distribution.

| Parameters | α | β | γ (δ for NIG) | μ | σ |
|---|---|---|---|---|---|
| Gaussian fit | | | | -6.09 × 10^-4 | 0.0151 |
| α-stable fit | 1.64 | 0.219 | 0.00815 | -0.000186 | |
| NIG fit | 55.43 | -0.2990 | 0.01254 | -0.000541 | |

Table 2: Estimated parameters of the normal, α-stable and NIG distributions.

Figure 4: Fits of the daily logarithm differences to the Gaussian and α-stable
Levy distributions.

Now we will test the hypothesis, H0, that the sample comes from a
Gaussian distribution with the above estimated parameters. Using the n =
2502 sample size, the hypothesis is examined by three criteria: the chi-square
goodness-of-fit test, the Anderson-Darling test and the Kolmogorov-Smirnov
(K-S) method.

For the Pearson test, the experimental statistic is χ² = 161.9 with 2
degrees of freedom, so the null hypothesis H0 for this case (the sample comes
from a normal distribution) can be rejected at the 5% significance level. In
turn, the Anderson-Darling test yields an experimental statistic of 26.6115
and a zero p-value, clearly rejecting H0 at the 5% significance level.
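To make the Pearson test concrete, the sketch below computes the statistic from binned counts and evaluates the tail probability of the chi-square distribution in the closed form that exists for an even number of degrees of freedom. The function names and the toy bin counts are hypothetical; only the statistic 161.9 with 2 degrees of freedom is taken from the text, and the binning actually used for the IPC data is not reproduced here.

```python
import math


def pearson_chi2(observed, expected_probs):
    """Pearson statistic: sum over bins of (O - E)^2 / E, with E = n * p."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p)
               for o, p in zip(observed, expected_probs))


def chi2_sf_even_dof(x, dof):
    """P(chi2 > x) for an even number of degrees of freedom (closed form)."""
    if dof <= 0 or dof % 2:
        raise ValueError("this closed form needs an even dof > 0")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, dof // 2):
        term *= half / k          # accumulates (x/2)^k / k!
        total += term
    return math.exp(-half) * total


# Statistic reported in the text for the Gaussian fit: 161.9 with 2 dof.
p_gauss = chi2_sf_even_dof(161.9, 2)

# Hypothetical bin counts (not the IPC data) against equal-probability bins.
chi2_toy = pearson_chi2([18, 22, 27, 33], [0.25, 0.25, 0.25, 0.25])
```

For the reported Gaussian statistic the resulting tail probability is of order 10^-35, far below the 5% level, in line with the rejection stated above.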

The experimental statistic for the K-S test can be obtained by arranging
the data in ascending order (x_1, x_2, ..., x_n) and deriving the maximum
difference between the rank statistics, (i - 1)/n and i/n, and the
theoretically calculated cdf:

$$
D = \max\left( \max_{1\le i\le n} \left| F(x_i) - \frac{i-1}{n} \right|,\;
\max_{1\le i\le n} \left| \frac{i}{n} - F(x_i) \right| \right). \tag{5}
$$



Given that the parameters of the fitting distributions were estimated from the
observed data, the standard limiting values of the K-S criterion cannot be
used. In this case, approximate p-values can be obtained via Monte Carlo
simulations [14]. First, the parameter vector is estimated for a given sample
of size n, θ = (α, β, γ, μ) being the result, and the test statistic is
calculated assuming that the sample is distributed according to F(x; θ),
returning a value of D. Next, a sample of n F(x; θ) variates is generated and
the parameter vector θ_1 is estimated. The test statistic is again calculated
assuming that the sample is distributed according to F(x; θ_1), returning a
value D_1.

Such a calculation was repeated 1000 times for n = 2502 (our sample size),
and the distribution of the resulting D_i was derived. Then, the upper 5% point
of this distribution was taken as the estimated limiting value. The estimate of
the p-value is calculated as the fraction of simulations in which the test
statistic is at least as large as D. The test statistics and the corresponding
estimated p-values are shown in table 3. At 5% significance, the experimental
statistic can be compared with the 95% limiting value obtained from the
Monte Carlo simulations. The value of the test statistic (0.0651) exceeds the
limiting value (0.0184) and yields a p-value smaller than 0.05, forcing us,
again, to reject the null hypothesis. In summary, the three tests we have used
reject the Gaussian distribution as the underlying distribution.
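The Monte Carlo procedure described above can be sketched for the Gaussian case, where maximum-likelihood fitting and the cdf are available in closed form. This is an illustration under stated assumptions, not the code used for table 3: the sample is a synthetic heavy-tailed one (Cauchy draws), the sample size and number of replications are reduced to keep it fast, and all function names are hypothetical.

```python
import math
import random


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_normal(xs):
    """Maximum-likelihood estimates of the mean and standard deviation."""
    n = len(xs)
    mu = sum(xs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)
    return mu, sigma


def ks_statistic(xs, cdf):
    """D of eq. (5): largest gap between the cdf and the rank statistics."""
    srt = sorted(xs)
    n = len(srt)
    d = 0.0
    for i, x in enumerate(srt, start=1):
        f = cdf(x)
        d = max(d, abs(f - (i - 1) / n), abs(i / n - f))
    return d


def ks_bootstrap(xs, n_sim, rng):
    """Return (D, 95% limiting value, p-value), refitting on every replicate."""
    mu, sigma = fit_normal(xs)
    d_obs = ks_statistic(xs, lambda x: normal_cdf(x, mu, sigma))
    sims = []
    for _ in range(n_sim):
        fake = [rng.gauss(mu, sigma) for _ in xs]
        m1, s1 = fit_normal(fake)
        sims.append(ks_statistic(fake, lambda x: normal_cdf(x, m1, s1)))
    sims.sort()
    limit = sims[int(math.ceil(0.95 * n_sim)) - 1]
    p_value = sum(d >= d_obs for d in sims) / n_sim
    return d_obs, limit, p_value


rng = random.Random(2010)
# Synthetic heavy-tailed sample (Cauchy draws), standing in for real returns.
heavy = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(300)]
d_obs, limit, p_value = ks_bootstrap(heavy, 200, rng)
```

Refitting the parameters on every simulated sample is the essential step: it is what makes the simulated limiting value differ from the standard K-S tables, which assume fully specified parameters.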

3.2. Alpha-stable fit 

Using the method of maximum likelihood, the parameters of the Levy
skew α-stable distribution that fits the experimental data were found. Their
values are presented in table 2, where α = 1.64 is the index, β = 0.219
the skewness parameter, γ = 0.00815 a scale factor, and μ = -0.000186 a
location parameter. The plot of the fitted Levy pdf presented in figure 4
shows good agreement with the empirical distribution. Now we will test
the hypothesis, H0, that the sample comes from a Levy distribution with the
above estimated parameters. Since the Anderson-Darling p-values for this
distribution are not available, we will use the chi-square goodness-of-fit
test and the K-S method.

| Case | K-S test statistic | K-S (estimated) limiting value at the 0.05 significance level | p-value | Reject H0? |
|---|---|---|---|---|
| Gaussian | 0.0651 | 0.0184 | 0.000 | Yes |
| α-stable Levy | 0.0165 | 0.0209 | 0.133 | No |
| NIG | 0.0272 | 0.0207 | 0.01 | Yes |

Table 3: Test statistics, K-S criterion limiting values and the corresponding p-values based
on 1000 simulated samples for the normal, α-stable Levy and NIG fits.

The Pearson statistic is χ² = 16.27 with 15 degrees of freedom. The
probability of χ² exceeding the experimental value was found to be 0.636.
Therefore, at the 5% significance level, the null hypothesis H0 (the sample
comes from an α-stable Levy distribution) cannot be rejected.

The K-S test statistic and the corresponding estimated p-value for the
α-stable Levy distribution are shown in table 3. Again, at 5% significance,
the experimental statistic can be compared with the 95% limiting value
obtained from the Monte Carlo simulations. For the Levy fit, the
experimental statistic is D = 0.0165, which is smaller than the limiting value
D_{1-α} = 0.0209. Therefore, the null hypothesis H0 (the sample comes from an
α-stable Levy distribution) cannot be rejected at the 5% significance level.
The corresponding p-value obtained from the Monte Carlo simulation was found
to be 0.133.

It is worth noting here that the statistics and p-values obtained via
Monte Carlo simulations are quite different from those commonly used as
reference values in standardized tests for normality. This supports the view
that the correct procedure is to estimate these values on a case-by-case basis.
3.3. Normal-inverse Gaussian fit

The hyperbolic distribution also provides the possibility of modeling fat
tails. The normal-inverse Gaussian (NIG) distributions were introduced as a
subclass of the generalized hyperbolic (GH) laws, and were developed [15] for
modeling the grain size distribution of windblown sand. Empirical studies
[16] suggest a good fit of the NIG law to financial data. In fact, in Ref. [17]
it was found that the NIG distribution fits the IPC data and other
Mexican financial indices nicely. Therefore, we will also fit the data to a
NIG distribution and compare our results with the α-stable and Gaussian fits.

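The GH distribution is specified by a scale δ, an exponent α, a skewness
parameter β, a location parameter μ, and a parameter λ that characterizes the
subclasses of the GH distribution. Its probability density function is given
by:

$$
f_{GH}(x;\alpha,\beta,\delta,\lambda,\mu) = k \left[\delta^2 + (x-\mu)^2\right]^{\frac{1}{2}\left(\lambda-\frac{1}{2}\right)} K_{\lambda-\frac{1}{2}}\!\left(\alpha\sqrt{\delta^2 + (x-\mu)^2}\right) e^{\beta(x-\mu)}, \tag{6}
$$

where:

$$
k = \frac{(\alpha^2-\beta^2)^{\lambda/2}}{\sqrt{2\pi}\,\alpha^{\lambda-\frac{1}{2}}\,\delta^{\lambda}\,K_{\lambda}\!\left(\delta\sqrt{\alpha^2-\beta^2}\right)}, \tag{7}
$$

and K_λ denotes the modified Bessel function of the second kind.

The NIG distribution corresponds to λ = -1/2 and is able to model symmetric
and asymmetric distributions with possibly long tails in both directions.
It has the probability density function:

$$
f_{NIG}(x;\alpha,\beta,\delta,\mu) = \frac{\alpha\delta}{\pi}\, e^{\delta\sqrt{\alpha^2-\beta^2} + \beta(x-\mu)}\, \frac{K_1\!\left(\alpha\sqrt{\delta^2+(x-\mu)^2}\right)}{\sqrt{\delta^2+(x-\mu)^2}}. \tag{8}
$$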

Our fit to the IPC data yields α = 55.43, β = -0.2990, δ = 0.01254 and μ =
-0.000541 (see table 2). The Kolmogorov-Smirnov experimental statistic
(see table 3) was found to be D = 0.0272, which is larger than the 95% limiting
value (D_{1-0.05} = 0.0207) obtained from a 1000-realization Monte Carlo
simulation. For this case, the test statistic yields a p-value (0.01) smaller
than 0.05, forcing us to reject the null hypothesis at the 5% significance
level. It must be mentioned, nevertheless, that the null hypothesis (the sample
comes from a NIG distribution) could not be rejected at the 1% significance
level.
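As a consistency check on the NIG density of eq. (8), the sketch below evaluates it with the fitted parameters from table 2 and verifies numerically that it integrates to one. To stay within the Python standard library, K_1 is computed from its integral representation by the trapezoid rule; this choice, the integration grid and the function names are assumptions of the sketch, and a dedicated special-function library would normally be preferred.

```python
import math


def bessel_k1(z, steps=400, upper=8.0):
    """K_1(z) = integral_0^inf exp(-z cosh t) cosh t dt, trapezoid rule.

    Adequate for z of order 0.1 or larger; not meant for very small z.
    """
    h = upper / steps
    total = 0.5 * math.exp(-z)               # half weight at t = 0
    for i in range(1, steps + 1):
        c = math.cosh(i * h)
        total += math.exp(-z * c) * c        # negligible near the upper end
    return h * total


def nig_pdf(x, alpha, beta, delta, mu):
    """Normal-inverse Gaussian density, eq. (8)."""
    q = math.sqrt(delta ** 2 + (x - mu) ** 2)
    g = math.sqrt(alpha ** 2 - beta ** 2)
    return ((alpha * delta / math.pi)
            * math.exp(delta * g + beta * (x - mu))
            * bessel_k1(alpha * q) / q)


# Parameters of the NIG fit reported in table 2.
params = dict(alpha=55.43, beta=-0.2990, delta=0.01254, mu=-0.000541)

# Riemann sum on [-0.5, 0.5]; the density is negligible outside this range.
step = 0.001
grid = [-0.5 + step * i for i in range(1001)]
area = step * sum(nig_pdf(x, **params) for x in grid)
```

With these parameters the variance of the NIG law is close to δ/α, which corresponds to a standard deviation of about 0.015, comparable to the Gaussian σ in table 2.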

4. Comparison with previous studies and discussion 



In [11] a sample of the IPC covering the 13-year period 04/19/1990 to
08/21/2003 was analyzed. The authors claimed that the cumulative
distribution function for extreme variations can be described by a Pareto-Levy
model with shape parameters α = 3.634 ± 0.272 and α = 3.540 ± 0.278 for
its positive and negative tails, respectively. As a consequence, they concluded
that the process that governs the time series is well outside the Levy regime.
However, as was reported in [10], a value of the tail exponent of about 3 may
very well indicate a Levy-stable distribution with α of about 1.8. Hence, the
results obtained in [11] could be a consequence of working with small samples.
Following [12], we generated samples of size N = 10^3, 2502, 10^4, 10^5, 10^6
and 10^7 of Levy-stable distributed random variables with parameters α = 1.7
(which is close to the α = 1.64 obtained for our IPC data), β = 0, γ = 1 and
μ = 0. Then, the tail index α was estimated [12] with the method developed
in [10], which combines maximum-likelihood fitting methods with
goodness-of-fit tests based on the Kolmogorov-Smirnov statistic and likelihood
ratios. The results of the simulations are displayed in table 4. As can be
observed in table 4, for N = 2502 a value of α = 2.137 was obtained, which is
also outside the Levy regime. These simulations suggest that high-frequency
data are needed in order to obtain reliable results when estimating the tail
index [12].

| Sample size N | Estimated α |
|---|---|
| 10^3 | 2.166 |
| 2502 | 2.137 |
| 10^4 | 1.853 |
| 10^5 | 1.771 |
| 10^6 | 1.613 |
| 10^7 | 1.714 |

Table 4: Estimated α as a function of the sample size.
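The sample-size experiment can be imitated in a few lines. The sketch below is not the estimation method used for table 4: it swaps in the simpler Hill estimator, applied to the largest 5% of absolute values (an arbitrary cutoff), and draws symmetric stable variates with the Chambers-Mallows-Stuck method. Function names, the seed and the sample sizes are assumptions, and the numbers it produces are not expected to reproduce table 4.

```python
import math
import random


def symmetric_stable(alpha, n, rng):
    """Symmetric (beta = 0) stable draws, Chambers-Mallows-Stuck method."""
    out = []
    for _ in range(n):
        v = math.pi * (rng.random() - 0.5)   # uniform on (-pi/2, pi/2)
        w = rng.expovariate(1.0)             # exponential with mean 1
        out.append(math.sin(alpha * v) / math.cos(v) ** (1.0 / alpha)
                   * (math.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha))
    return out


def hill_tail_index(xs, k):
    """Hill estimate of the tail exponent from the k largest |x| values."""
    top = sorted((abs(x) for x in xs), reverse=True)[:k + 1]
    threshold = top[k]                       # (k+1)-th largest value
    return k / sum(math.log(x / threshold) for x in top[:k])


rng = random.Random(2502)
estimates = {}
for n in (1000, 2502, 10000):
    sample = symmetric_stable(1.7, n, rng)
    estimates[n] = hill_tail_index(sample, k=n // 20)   # top 5% of the sample
```

Inspecting `estimates` for increasing n gives a feel for how strongly a tail-index estimate from a few thousand points can depend on the sample size and on the chosen cutoff.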

The results obtained in section 3.1 are in general agreement with previous
results obtained in [17], where a better fit than the Gaussian was obtained
with the NIG distribution. In that paper the data set under analysis covered
the period January 1993 to January 2003 and comprised 2505 returns. The K-S
goodness-of-fit analysis was performed at the 1% significance level. A clear
rejection of normality was obtained by those authors. At this significance
level their fit of the data using a NIG distribution was not rejected.
Nevertheless, our results clearly show that the α-stable distribution provides
a better fit than the NIG distribution for daily fluctuations of the IPC data.






5. Conclusions 

The three tests we have used forcefully reject the hypothesis that the data of
the Mexican financial index, IPC, are normally distributed. Likewise, at the 5%
significance level, the corresponding tests force us to reject the null
hypothesis of the data being distributed as a normal inverse Gaussian. On the
other hand, the null hypothesis that our sample comes from an α-stable Levy
distribution cannot be rejected at the 5% significance level. Our analysis of
the impact of the sample size on the estimation of the tail index confirms that
high-frequency data are needed in order to determine whether or not a given
distribution is stable. Of the three distributions we have considered, the
α-stable distribution provides the best fit for daily fluctuations of the IPC
data.